ABSTRACT

We discuss two slightly counter-intuitive findings about the environmental dependence 
of clustering in the Sloan Digital Sky Survey. First, we find that the relation between 
clustering strength and density is not monotonic: galaxies in the densest regions are 
more strongly clustered than are galaxies in regions of moderate overdensity; galaxies 
in moderate overdensities are more strongly clustered than are those in moderate un- 
derdensities; but galaxies in moderate underdensities are less clustered than galaxies 
in the least dense regions. We argue that this is natural if clustering evolved
gravitationally from a Gaussian field, since the highest peaks and lowest troughs in
Gaussian fields are similarly clustered. The precise non-monotonic dependence of galaxy
clustering on density is very well reproduced in a mock catalog which is based on a halo-model
decomposition of galaxy clustering. In the mock catalog, halos of different masses are 
all about 200 times denser than the critical density, and the dependence of small scale 
clustering on environment is entirely a consequence of the fact that the halo mass 
function in dense regions is top-heavy — another natural prediction of clustering from 
Gaussian initial conditions. 

Second, the distribution of galaxy counts in our sample is rather well described 
by a Poisson cluster model. We show that, despite their Poisson nature, correlations 
with environment are expected in such models. More remarkably, the expected trends 
are very like those in standard models of halo bias, despite the fact that correlations 
with environment in these models arise purely from the fact that dense regions are
dense because they happen to host more massive halos. This is in contrast to the usual
analysis which assumes that it is the large scale environment which determines the 
halo mass function. 

Key words: methods: analytical - galaxies: formation - galaxies: haloes - dark matter 
- large scale structure of the universe 
1 INTRODUCTION

This is the third in a series of papers which study the envi- 
ronmental dependence of galaxy clustering. In hierarchical 
models, there is a correlation between fluctuations on dif- 
ferent scales. This induces correlations between halo mass 
and/or formation and the larger scale environment of a halo 
(Mo & White 1996; Sheth & Tormen 2002) which, in turn, 
induce correlations between galaxies and their environments 
(Sheth, Abbas & Skibba 2004; Abbas & Sheth 2005). Abbas 
& Sheth (2006) showed that halo bias — the correlation be- 
tween halo mass and environment — was able to account for 
the environmental dependence of clustering in the Sloan Digital
Sky Survey (hereafter SDSS). In that study, a galaxy's
environment was defined as the number of galaxies within
$8\,h^{-1}$ Mpc, and only a relatively small range of environments
was considered: the two-point correlation function $\xi(r|\delta)$ of
galaxies in the densest third of the sample was shown to be
about five times larger than that of the full sample, whereas
the galaxies in the least dense 30% were less strongly clustered
on scales larger than about $0.1\,h^{-1}$ Mpc.

In this paper we show that clustering strength is not 
a monotonic function of environment in the least dense re- 
gions: compared to the objects in the least dense 30% of the 
sample, the galaxies in the least dense 10% are more strongly 
clustered. Nevertheless, the statistical halo-bias based effect
accounts very well for the observed non-monotonic relation
between environment and clustering strength. Section 2
presents our measurements and shows that a halo-model
(see Cooray & Sheth 2002 for a review) based mock catalog
exhibits the same features as seen in the data. Section 3
discusses a simple model of the effect, and a final section
summarizes our results and discusses some implications.

An Appendix discusses a somewhat surprising interpre- 
tation of the origin of halo bias. This is motivated by the 
fact that two of the distributions which have routinely been 
found to provide a good description of galaxy counts in cells 
are the Thermodynamic or Generalized Poisson distribution 
(Saslaw & Hamilton 1984; Sheth 1995), and the Negative Bi- 
nomial distribution (Moran 1984). These distributions pro- 
vide a good description of the counts in our catalog as well; 
they are both examples of Poisson cluster models (Daley 
& Vere-Jones 2003). We show that, despite their Poisson
nature, Poisson cluster models are expected to show envi- 
ronmental effects. However, in such models, dense regions 
are dense because they happen to host massive halos. The 
standard analysis of halo bias assumes that the large scale 
environment determines the halo mass function, rather than 
the other way around. Nevertheless, we show that the ex- 
pected halo bias in Poisson cluster models bears surprising 
similarity to the standard models of halo bias (Mo & White 
1996; Sheth & Tormen 2002). 

Throughout, we show results for a flat ΛCDM model
for which $(\Omega, h, \sigma_8) = (0.3, 0.7, 0.9)$ at $z = 0$. Here $\Omega$ is
the density in units of the critical density today, $h$ is the Hubble
constant today in units of 100 km s$^{-1}$ Mpc$^{-1}$, and $\sigma_8$
describes the rms fluctuations of the initial field, evolved to
the present time using linear theory, when smoothed with a
tophat filter of radius $8\,h^{-1}$ Mpc. The Very Large Simulation
(VLS) we use to construct mock catalogs was made available
to the public by the Virgo consortium. It was run with the
same ΛCDM cosmology, and followed the evolution of $512^3$
particles in a cubic box with sides $L = 479\,h^{-1}$ Mpc (Yoshida
et al. 2001). Dark matter halos were identified in this particle
distribution using the Friends-of-Friends method. Each
halo has a mass, a position and a velocity.



2 THE ENVIRONMENTAL DEPENDENCE OF CLUSTERING

To study the environmental dependence of clustering, we
began with a galaxy catalog drawn from a parent
catalog which was slightly larger than the SDSS DR4
database, and volume limited to $M_r < -19.5$. This catalog
contains about 78,000 galaxies with accurate angular positions
and redshifts; the associated comoving number density
is $0.01\,(h^{-1}\,{\rm Mpc})^{-3}$.

The environment of each galaxy in this catalog was defined
as the number of such galaxies $N_8$ within $8\,h^{-1}$ Mpc.
No attempt was made to correct for redshift space distortions,
which, on these scales, should be relatively small
(though not negligible; see discussion of Fig. 4 below). Figure 1
shows the distribution of densities which results. We
have constructed four subsamples of this catalog on the
basis of environment as follows. The lowest density environments
we probe use the 10% of the objects with the
fewest neighbours within $8\,h^{-1}$ Mpc. A slightly less severe
Figure 1. Distribution of galaxy overdensity in a volume limited
catalog with $M_r < -19.5$ drawn from the SDSS. The density was
defined by counting galaxies within $8\,h^{-1}$ Mpc. Dashed lines show
where the area under the curve equals 10%, 30%, 70% and 90%
of the total. These correspond to $\delta_8 = -0.78$, $-0.522$, $0.196$, and
$1.065$.



cut uses 30% rather than 10% of the objects. We then do
the same for overdense regions: we select subsets containing
10% and 30% of the objects having the most neighbours
within $8\,h^{-1}$ Mpc. Hashed regions indicate the various density
thresholds which these cuts imply. The actual overdensity
thresholds $\delta_8 = N_8/\langle N_8\rangle - 1$ are indicated in the upper
right corner of the figure, with some abuse of notation: $\delta_n$
means $n$% of the galaxies were in lower density environments
(i.e., $p(<\delta_{n\%}) = n/100$). In what follows, we use these limiting
values to define a number of subsamples. The Appendix
discusses the solid and dashed lines; these show two Poisson
cluster models that are able to provide reasonable descriptions
of the measurements.
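The percentile cuts described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `overdensity_thresholds` is ours, and the neighbour counts are synthetic draws rather than SDSS measurements.

```python
import numpy as np

def overdensity_thresholds(n8, percentiles=(10, 30, 70, 90)):
    """Return delta_8 for each object and the delta_8 value at each percentile."""
    n8 = np.asarray(n8, dtype=float)
    delta8 = n8 / n8.mean() - 1.0            # delta_8 = N_8 / <N_8> - 1
    cuts = np.percentile(delta8, percentiles)
    return delta8, dict(zip(percentiles, cuts))

# Synthetic neighbour counts, for illustration only (not SDSS data).
rng = np.random.default_rng(0)
n8_demo = rng.negative_binomial(2, 0.04, size=10_000) + 1
delta8_demo, cuts_demo = overdensity_thresholds(n8_demo)

lowest_10 = delta8_demo <= cuts_demo[10]     # least dense 10% of objects
highest_30 = delta8_demo >= cuts_demo[70]    # densest 30% of objects
```

Because the counts are integers, ties at a threshold can make a subsample slightly larger than its nominal fraction.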

Filled circles in the right hand panel of Figure 2 show 
the projected correlation function of the full sample; this is 
computed by integrating $\xi(r_p,\pi)$ over $0 < \pi < 35\,h^{-1}$ Mpc.
Error bars are from jack-knife resampling in which the statis- 
tics were remeasured after omitting a random region, and re- 
peated thirty times (approximately 1.5 times the total num- 
ber of bins in separation for the results presented, as in Ab- 
bas & Sheth 2006). Filled triangles show the corresponding 
measurement in the subsample which contains 30% of the 
galaxies chosen to lie in the densest regions. Filled squares 
show the clustering in a sample of the same size, but now 
drawn from the least dense regions. The open triangles and 
squares show the result of selecting only the densest and 
least dense 10%, rather than 30%, of the sample. 
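A minimal sketch of how such jack-knife error bars might be computed is given below. It assumes the standard delete-one normalization, which the text does not spell out, and the input array is made up for illustration.

```python
import numpy as np

def jackknife_errors(resampled):
    """Delete-one jackknife errors from an (n_regions, n_bins) array,
    where row i holds the statistic remeasured with region i omitted."""
    resampled = np.asarray(resampled, dtype=float)
    n = resampled.shape[0]
    mean = resampled.mean(axis=0)
    # Assumed normalization: (n - 1)/n times the summed squared deviations.
    var = (n - 1) / n * np.sum((resampled - mean) ** 2, axis=0)
    return np.sqrt(var)

# Toy input: 30 leave-one-out measurements in 20 separation bins (made-up numbers).
rng = np.random.default_rng(1)
toy = 100.0 + rng.normal(scale=0.5, size=(30, 20))
errors = jackknife_errors(toy)
```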

A number of unusual features are worth noting. First, 
on small scales, the correlation functions of all the subsam- 
ples have larger amplitudes than that of the parent sample 
from which they are drawn. This is most easily understood 
by supposing that the full sample is divided into two halves, 
Figure 2. Environmental dependence of clustering in the SDSS for galaxies volume-limited to $M_r < -19.5$ (right) and in a halo-model
based mock catalog (left). Filled circles show the clustering in the full sample; filled and open triangles show subsets containing 30% and
10% of the objects classified as being in the densest regions; filled and open squares show similar measurements but in the least dense
regions.



say D and U, for dense and underdense. If the pair counts
in the total sample are denoted $TT$, then the correlation
function of the full sample is $1 + \xi_{TT} = TT/RR$, where $RR$
denotes the counts in an unclustered distribution of the same
number density. If we define $\xi_{DD}$ and $\xi_{UU}$ similarly, then

$$1 + \xi_{TT} = \frac{TT}{RR} = \frac{DD + UU + 2DU}{RR} = \frac{1+\xi_{DD}}{4} + \frac{1+\xi_{UU}}{4} + \frac{2\,(1+\xi_{DU})}{4}. \qquad (1)$$

However, on scales smaller than that on which the environment
was defined ($8\,h^{-1}$ Mpc in our case), $DU \approx 0$ by definition,
so $\xi_{DU} \approx -1$. In this limit, $\xi_{TT} = (\xi_{DD} + \xi_{UU})/4 - 1/2$,
or

$$\xi_{DD} + \xi_{UU} = 2\,(1 + 2\,\xi_{TT}). \qquad (2)$$

Thus, it is possible that $\xi_{DD}$ and $\xi_{UU}$ are both larger than
$\xi_{TT}$.
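A short numerical check of this pair-counting argument is sketched below. The amplitudes are illustrative, not measured values, and the helper `xi_total` is a name introduced here for the two-equal-halves decomposition.

```python
def xi_total(xi_dd, xi_uu, xi_du):
    """Correlation function of the full sample for two equal halves D and U."""
    return (1 + xi_dd) / 4 + (1 + xi_uu) / 4 + 2 * (1 + xi_du) / 4 - 1

# Small-scale limit: no cross pairs, so xi_DU = -1.
xi_dd, xi_uu = 30.0, 50.0            # illustrative amplitudes only
xi_tt = xi_total(xi_dd, xi_uu, -1.0)

# Both subsamples are more strongly clustered than the parent sample,
# and xi_DD + xi_UU = 2 (1 + 2 xi_TT) holds.
print(xi_tt, xi_dd + xi_uu, 2 * (1 + 2 * xi_tt))
```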

Second, on small scales, $\xi$ for the sample of galaxies in
less dense regions can be substantially larger than it is for 
galaxies in denser regions. Abbas & Sheth (2005) argue that 
Figure 3. Same as the previous figure, but now the clustering signal is shown normalized to that for the full sample. Dashed lines in
the panel on the right show the loci traced out by the points shown in the panel on the left.



this would arise if the average halo mass in underdense re- 
gions is smaller than it is in dense regions (see their Section 
2.2). Evidence that this is the case comes from the fact that
$\xi$ shows a feature at $\sim 0.3\,h^{-1}$ Mpc for the underdense sample,
but not in the dense sample. Abbas & Sheth argue that
this feature reflects the transition from pairs which are in the
same halo to those which are in separate halos. In the underdense
regions, there are few neighbouring halos within
$8\,h^{-1}$ Mpc (by definition), so this transition is obvious; since
there may be many neighbouring halos in the dense regions,
this transition is less obvious in the denser samples.



A careful inspection suggests inflection points on scales
of order $2\,h^{-1}$ Mpc in the denser samples. If this is due to the
same transition, then the radii of halos in the denser samples
are about $(2/0.3)$ times larger than in the least dense
sample. It is standard to assume that the halos in dense
and underdense regions have the same virial densities, so
our measurements suggest that the halos in dense regions
are typically about $(2/0.3)^3 \approx 300$ times more massive than
those in the least dense sample.
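The arithmetic behind this estimate is simple enough to write out. The only assumption is the one stated in the text: mass scales as radius cubed at fixed virial density.

```python
r_dense, r_under = 2.0, 0.3          # h^-1 Mpc, inflection scales quoted in the text
radius_ratio = r_dense / r_under     # about 6.7
mass_ratio = radius_ratio ** 3       # m scales as r^3 at fixed virial density
print(f"radius ratio = {radius_ratio:.2f}, mass ratio = {mass_ratio:.0f}")
```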

Finally, on larger scales where halo correlations are important,
Figure 2 shows that $\xi(r|\delta)$ is strongest in the densest
regions. This is not unexpected in the context of the
linear peaks-bias model of Kaiser (1984), if, on average, the
densest regions at the present time formed from the densest 
regions in the initial fluctuation field. This is because, in the 
initial Gaussian random field, the densest regions were more 
strongly clustered than regions of average density. Although 
the galaxies in less dense regions are less strongly clustered 
than those in the very dense regions, the Figure suggests 
that the least dense 10% are more strongly clustered than 
when the cut is 30%. Figure 3, which shows the ratio
$\xi(r|\delta)/\xi(r)$, shows this effect slightly more clearly. Although
the difference on any given scale is only slightly larger than
the error bars, it is in the same sense on all scales. (While
the errors on the measured $\xi$ are correlated between bins,
the correlation between neighbouring bins is not expected
to be strong, and it is expected to decrease with bin separation.
In any case, the next section shows that the effect
is present with much larger statistical significance in mock
catalogs.)

2.1 Measurements in mock galaxy samples 

A more quantitative comparison between our model and the
measurements is shown in the left hand panel of Figure 2.
The panel shows measurements of the environmental dependence
of clustering in a mock galaxy catalog which was
constructed as described by Abbas & Sheth (2006). In brief,
we assigned mock 'galaxies' to halos in the VLS simulation
by assuming that only halos more massive than a critical $m_L$
may contain galaxies. The first galaxy in a halo is called the
'central' galaxy. The number of other 'satellite' galaxies is
drawn from a Poisson distribution with mean $N_s(m)$ where

$$N_s(m) = (m/m_1)^{\alpha} \quad {\rm if}\ m > m_L. \qquad (3)$$

We distribute the satellite galaxies in a halo around the halo
centre so that the radial profile follows that of the dark
matter (i.e., the galaxies are assumed to follow an NFW
profile). We set $m_L = 10^{11.76}\,h^{-1}M_\odot$, $m_1 = 10^{1\ldots}\,h^{-1}M_\odot$,
and $\alpha = 1.13$; Zehavi et al. (2005) show that these choices
are appropriate for this set of SDSS galaxies: $M_r < -19.5$.
The resulting catalog has about $10^6$ mock galaxies, because
the volume of our simulation is about ten times larger than
that of our SDSS catalog. This means that we can measure
the environmental effects in the mock with greater precision
than in the data.
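The occupation step of this recipe (one central plus a Poisson number of satellites) can be sketched as follows. The sketch is illustrative only: `M_1` is a placeholder value chosen for the example rather than the value used for the mock, the halo masses are random toy numbers rather than the VLS halos, and the NFW radial placement is omitted.

```python
import numpy as np

M_L = 10 ** 11.76      # h^-1 Msun, minimum halo mass
ALPHA = 1.13           # satellite power-law slope
M_1 = 10 ** 13.0       # h^-1 Msun, PLACEHOLDER for illustration only

def mean_satellites(m):
    """Equation (3): N_s(m) = (m/m_1)^alpha for m > m_L, else 0."""
    m = np.asarray(m, dtype=float)
    return np.where(m > M_L, (m / M_1) ** ALPHA, 0.0)

def galaxies_per_halo(m, rng):
    """One central per halo above m_L plus a Poisson number of satellites."""
    m = np.asarray(m, dtype=float)
    centrals = (m > M_L).astype(int)
    satellites = rng.poisson(mean_satellites(m))
    return centrals + satellites

rng = np.random.default_rng(42)
halo_masses = 10 ** rng.uniform(11.0, 15.0, size=1000)   # toy masses, not the VLS
n_gal = galaxies_per_halo(halo_masses, rng)
```

Since the occupation depends on halo mass alone, any environmental trend in a catalog built this way can only come from the halo mass function varying with environment.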

The important point, which we note explicitly here, is
the following: By assuming that equation (3) is the same
function of $m$ for all environments, and by assuming that
the radial profile of the galaxies depends only on halo mass
and not on environment, we have constructed a galaxy catalog
in which all environmental effects are entirely a consequence
of the correlation between halo mass and environment.
Therefore, the loci traced out by the various sets of
symbols shown in Figure 2 represent the predicted environmental
dependence of $\xi$ if there are no environmental effects
other than the statistical one determined by the initial fluctuation
field. The measurements in the mock catalog are
extremely similar to those in the SDSS itself, leaving little
room for additional environmental effects.

In the mock catalog, the halo mass function in dense 
regions is top-heavy. This is illustrated in Figure 4, which 
shows the abundance of halos per logarithmic bin in mass, 



weighted by the number of galaxies in the mass bin. Symbols
show this galaxy-weighted halo mass function for bins
in environment, where the environment is defined by the
number of neighbours $N_8$ in real space. Curves show the
corresponding measurement when $N_8$ is defined (as it is in
the SDSS data) from redshift space positions. Although Fingers
of God tend to scatter some galaxies in massive halos
into less dense environments, notice that the mix of halos is
shifted towards lower masses in the less dense regions.

The precise mix of halos determines the amplitude of 
the correlation function on both large and small scales, so 
the agreement seen on all scales in Figures 2 and 3 suggests 
that the halo abundances shown in Figure 4 are representa- 
tive of those in the SDSS: massive halos preferentially popu- 
late dense regions. Indeed, our estimate of a factor of ~ 100 
difference in mass (based on Figure 2) appears to be in good 
agreement with the mass functions shown in Figure 4. 

The fact that the mock catalog accurately reproduces
the inflection in $\xi(r|\delta)$ seen at $\sim 0.3\,h^{-1}$ Mpc in the underdense
regions has an interesting implication. Abbas & Sheth
(2005) show that this inflection scale reflects the typical
virial diameters of halos in these environments. Therefore,
the agreement with the SDSS suggests that the mock catalog
has modeled the correlation between halo mass and virial
radius accurately. Since modelers differ in what the virial density
should be (200 times background density? 200 times critical
density? some other multiple of background density?), the
agreement is nontrivial. In the mock, halos are 200 times
the critical density whatever their mass. If they were 200
times the background density instead, they would be larger
by a factor of $\Omega^{-1/3} \approx 3/2$. As samples get larger, $\xi(r|\delta)$
will become more precisely measured, and so it may provide
an interesting constraint on halo densities.
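The size factor quoted here follows from a one-line scaling, sketched below under the assumption that halo radius scales as (mass/density)^(1/3) and that the background density is $\Omega$ times the critical density.

```python
omega = 0.3                            # matter density in units of critical
# At fixed mass, r scales as density^(-1/3); 200 x background is a factor
# omega lower in density than 200 x critical.
size_factor = omega ** (-1.0 / 3.0)
print(f"halos would be larger by a factor of {size_factor:.3f}")
```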



3 LINEAR BIAS AND ENVIRONMENTAL 
DEPENDENCE OF CLUSTERING 

The previous section showed that the large scale clustering
strength is not a monotonic function of environment. A simple
model of this effect follows from writing the linear peaks
bias model in terms of the nonlinear density $\delta$:

$$B_\delta = \frac{\delta_0(\delta)}{\sigma_\delta^2} \qquad (4)$$

(equation 5 of Kaiser 1984 with Sheth 1998b,c), where $\delta_0(\delta)$
denotes the value of the density contrast in linear theory
when the fully nonlinear overdensity is $\delta$, and $\sigma_\delta$ denotes
the rms value of the linear fluctuation field when smoothed
on a scale which contains mass $\bar\rho V(1+\delta)$. For a power-law
power spectrum, $\sigma_\delta^2 = \sigma_0^2/(1+\delta)^{(n+3)/3}$, so

$$\xi(r|\delta) \approx B_\delta^2\,\xi(r), \qquad B_\delta = \frac{\delta_c\,[1 - (1+\delta)^{-1/\delta_c}]}{\sigma_0^2\,(1+\delta)^{-(n+3)/3}}, \qquad (5)$$

where our relation between $\delta_0$ and $\delta$ provides an excellent
description of the spherical collapse model if we set $\delta_c \approx
1.686$. Note that, in our case, $\sigma_0 = \sigma_8 = 0.9$.
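To make the shape of this bias relation concrete, the sketch below evaluates it numerically. It assumes the form $B_\delta = \delta_0(\delta)/\sigma_\delta^2$ with $\delta_0(\delta) = \delta_c[1 - (1+\delta)^{-1/\delta_c}]$ and $\sigma_\delta^2 = \sigma_0^2 (1+\delta)^{-(n+3)/3}$; the function names are ours, and the example is a rough illustration rather than a reproduction of Figure 5.

```python
import numpy as np

DELTA_C = 1.686

def linear_delta(delta):
    """Spherical-collapse mapping from nonlinear delta to linear delta_0."""
    return DELTA_C * (1.0 - (1.0 + delta) ** (-1.0 / DELTA_C))

def bias_squared(delta, sigma0=0.9, n=-1.2):
    """B_delta^2 with sigma_delta^2 = sigma0^2 (1 + delta)^(-(n+3)/3)."""
    delta = np.asarray(delta, dtype=float)
    sigma2 = sigma0 ** 2 * (1.0 + delta) ** (-(n + 3.0) / 3.0)
    return (linear_delta(delta) / sigma2) ** 2

# Evaluate at the overdensity thresholds quoted for Figure 1.
for d in (-0.78, -0.522, 0.196, 1.065):
    print(f"delta = {d:+.3f}  B^2 = {float(bias_squared(d)):.3f}")
```

Under these assumptions the squared bias vanishes at $\delta = 0$ and rises on both sides, so the most underdense cut is more strongly biased than the moderately underdense one.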

Equation (5) shows that $B_\delta^2$ is not a monotonic function
of $\delta$, nor is it symmetric around $\delta = 0$. Figure 5 shows
this explicitly for $n = -1.2$ (since this would produce a correlation
function with slope $-1.8$, which is approximately
the slope we see for the full sample) and a few values of $\sigma_0$
Figure 4. Galaxy-weighted halo mass function as a function of environment in our mock catalog. Filled circles show this quantity
when all galaxies in the mock catalog are included. The other sets of symbols show this quantity when only galaxies in specific bins
in environment are used. These bins are the 10 percent of the objects with the fewest neighbours within $8\,h^{-1}$ Mpc ($N_8$), the range
between 10 and 30 percent, the range between 70 and 90 percent, and the 10 percent with the largest $N_8$. Empty squares, filled squares,
filled triangles and empty triangles show results for these bins when $N_8$ is defined using real space positions; dashed curves show the
corresponding measurement when $N_8$ is in redshift space. In real space, objects with the lowest $N_8$ values populate the lowest mass halos.
While this remains true in redshift space, there are a number of objects in massive halos which appear to inhabit less dense regions;
these are galaxies in the Fingers of God of massive halos which reach into less dense regions.



(appropriate for a scale of about $8\,h^{-1}$ Mpc). While this simple
model is qualitatively consistent with our measurements,
making a more quantitative statement is less straightforward.

The symbols show a very rough indication of how
the bias scales with environment in the previous figures. We
caution that the locations of these points are not model free
because $\delta$ in the expression above is for the dark matter,
whereas our environment is defined by the galaxy distribution.
To place the points on this plot, we have assumed that
the galaxies in the total sample are unbiased (so $\delta = \delta_{\rm gal}$
and $\xi_{\rm all}$ (the filled circles in Figure 2) equals the correlation
function of the dark matter which appears in the right
hand side of equation (5)). For each value of $\delta_8$ in Figure 1
we determined a $\delta_{\rm gal}$ from requiring that $\bar N\,[1 + \delta_{\rm gal}] =
(1 + \delta_8)\,[1 + \bar N\,(1 + \bar\xi)]$; we used $\bar N = 0.01 \times 4\pi 8^3/3 = 21.45$
and $\bar N\bar\xi = 1/0.2^2 - 1 = 24$, as suggested by the fact that
equation (A9) with $b = 0.8$, when inserted in equation (A6),
provides a good description of the distribution of $N_8$ in the
data (c.f. solid line in Figure 1). This procedure accounts
for the fact that $\delta_8$ is computed in cells centred on galaxies,
whereas $\delta_{\rm gal}$ is not. (Equation A7 and associated discussion
shows why this scaling is reasonable.) The associated bias
factors for these points were read off from the large scale values
of the ratio shown in Figure 3 (recall we are assuming
that the full sample is unbiased relative to the dark matter).
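The conversion from galaxy-centred to randomly-centred overdensity described in this paragraph can be sketched as below. The helper name `delta_gal` is ours; the numbers are those quoted in the text (mean count of about 21.45 and $\bar N\bar\xi = 24$).

```python
import math

N_BAR = 0.01 * 4.0 * math.pi * 8.0 ** 3 / 3.0   # mean count in an 8 h^-1 Mpc sphere
N_XI = 1.0 / 0.2 ** 2 - 1.0                      # N_bar * xi_bar for b = 0.8

def delta_gal(delta8):
    """Solve N_bar (1 + delta_gal) = (1 + delta8) [1 + N_bar (1 + xi_bar)]."""
    return (1.0 + delta8) * (1.0 + N_BAR + N_XI) / N_BAR - 1.0

# Apply the conversion to the four thresholds quoted for Figure 1.
converted = {d8: delta_gal(d8) for d8 in (-0.78, -0.522, 0.196, 1.065)}
for d8, dg in converted.items():
    print(f"delta_8 = {d8:+.3f}  ->  delta_gal = {dg:+.3f}")
```

Because cells centred on galaxies contain more neighbours on average than randomly placed cells, every threshold shifts to a higher overdensity after the conversion.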

While the agreement is reassuring (e.g. this procedure
correctly predicts very weak clustering for the sample which
contains the 30% of the objects with lowest $N_8$), a better
analysis of the effects of bias is required to make this model
Figure 5. Bias as a function of environment. Curves show equation
(5) with $n = -1.2$ and three values of $\sigma_8$: $B_\delta^2$ is symmetric
about $\delta = 0$ for small values of $\sigma_0$, but becomes increasingly asymmetric
as $\sigma_0$ increases. Symbols (same as in Figure 3) show the
bias as a function of environment derived from Figure 3 following
the procedure described in the text.

more than simply illustrative. Our main point in the present 
work is to show that the clustering strength is not expected 
to be a monotonic function of environment. 



4 DISCUSSION AND CONCLUSIONS 

High peaks and low troughs in a Gaussian random field are 
similarly biased relative to the set of all fluctuations. Al- 
though nonlinear evolution destroys this symmetry, some 
remnants of it are expected to remain in the galaxy cluster- 
ing signal (equation 5 and Figure 5). We find that galaxies 
in less dense regions of the SDSS are less strongly clustered 
than galaxies in very dense regions, but that galaxies in the 
least dense regions are more strongly clustered than galaxies 
in regions of moderate underdensity (Figures 2 and 3). 

Our simple model for this effect (equation 5) makes a
qualitative prediction which can soon be tested: namely, the
strong clustering of underdense regions is more easily noticed
when $\sigma_0 \ll 1$. This is because $B_\delta^2$ is more symmetric about
$\delta = 0$ for small values of $\sigma_0^2$: Figure 5 shows this clearly.
At fixed redshift, this means that the effect should be easier
to notice for environments defined on larger scales. Alternatively,
for environments defined on a fixed comoving scale,
the fact that clustering is not a monotonic function of environment
should be easier to notice at higher redshift. In our
analysis of the SDSS, we had to go to extremely underdense
environments before we found evidence that clustering was
not monotonic with environment. Our model suggests that,
at higher redshift, one need not go to as extreme values of
$\delta$ to see the enhanced clustering of underdense regions.

We also showed that a mock galaxy catalog, constructed 



to reproduce the clustering of the full sample, exhibits the
same environment-dependent clustering signals as seen in
the SDSS (Figures 2 and 3). This agreement is non-trivial:
rather than being simple power laws with different amplitudes,
the correlation functions in different environments
exhibit inflections on various different scales. The agreement
suggests that a number of features of the mock are also true 
of the Universe: namely, the mix of dark matter halo masses 
is top-heavy in dense regions (Figure 4), massive halos have 
larger virial radii, and the primary drivers of correlations 
between galaxy luminosity and environment are the corre- 
lations between galaxy luminosity and host halo mass, and 
the correlation between halo mass and environment. In par- 
ticular, there is little room for the additional effects which 
Sheth & Tormen (2004) show may also play a role in deter- 
mining these correlations (also see Gao et al. 2005; Harker 
et al. 2005; Wechsler et al. 2006; Zhu et al. 2006). 

Environmental effects are also present in Poisson cluster
models (Appendix). In such models, halo bias is simply a
consequence of mass conservation: whereas the origin of halo
bias is usually stated as arising from the fact that dense regions
host massive halos, in Poisson cluster models, dense regions
are dense precisely because they happen to host massive
halos. Nevertheless, these models predict halo bias relations
which are surprisingly like those in the standard model of
halo bias (Mo & White 1996; Sheth & Tormen 2002). The
analysis in the Appendix is particularly interesting in view
of the fact that two Poisson cluster models, the Thermodynamic
or Generalized Poisson (Saslaw & Hamilton 1984; Sheth
1995) distribution and the Negative Binomial distribution,
both provide good descriptions of our data (Figure 1, and
also see recent analyses of the void probability function by
Croton et al. 2004 and Conroy et al. 2005).

All other cosmological parameters remaining fixed,
larger values of the rms fluctuation amplitude $\sigma_8$ imply a
larger range of environments. So the difference between the
densest and least dense regions increases with increasing $\sigma_8$.
Therefore, one might expect the environmental dependence
of clustering to yield useful information about $\sigma_8$, although
the results in Tinker et al. (2006) suggest otherwise. So it
is interesting that the dashed lines in Figure 3 tend to lie
slightly further from unity than do the SDSS measurements.
Determining whether or not this is indicating that the data
prefer a value of $\sigma_8$ which is lower than the value ($\sigma_8 = 0.9$)
used in the mocks is the subject of work in progress.
APPENDIX A: ENVIRONMENTAL EFFECTS
IN POISSON CLUSTER MODELS

The main text argued that the correlation function depends
on environment primarily because the halo distribution is
top-heavy in dense regions. While it is tempting to conclude
that this derives from the fact that more massive halos are
more strongly clustered, this is not the whole story. The
following calculation illustrates that, even in Poisson cluster
models, the distribution of halos depends on environment.
This is a simple consequence of mass conservation: a region
containing $N$ particles may not host a halo with mass $n > N$.

As the name suggests, Poisson cluster models are point
distributions in which cluster centers are distributed at random
(i.e., the distribution of clusters is Poisson); different
models are distinguished by specifying the probability that a
randomly selected cluster contains $n$ galaxies. The clusters
themselves are assumed to have zero size. Poisson cluster
distributions are also sometimes called Compound Poisson
(Daley & Vere-Jones 2003).

Let $p(N|V)$ denote the probability that a randomly placed
cell (i.e. not necessarily centred on a particle) of volume $V$
contains $N$ particles, and define

$$P(s|V) = \sum_N s^N\,p(N|V). \qquad (A1)$$

For a Compound Poisson distribution,

$$\ln P(s|V) = \bar N\,[H(s) - 1]\,\left[\partial H(s)/\partial s\,|_{s=1}\right]^{-1}, \qquad (A2)$$

where $\bar N$ is the mean of $p(N|V)$,

$$H(s) = \sum_{n>0} s^n\,h(n), \qquad (A3)$$

and $h(n)$ denotes the probability that a randomly chosen
cluster contains $n$ particles (e.g. Daley & Vere-Jones 2003).

In the main text we defined the environment of a particle
(in that case a galaxy) by counting the number of particles
in a cell of volume $V$ centred on it. Hence, we seek
an expression for the probability $q(N|V)$ that a cell of volume
$V$, centred on a randomly chosen particle, also contains
$N - 1$ other particles. For Compound Poisson distributions,
this distribution is simply related to $p(N|V)$, the distribution
of counts in randomly placed cells. This is because

$$q(N|V) = \frac{\sum_{n=1}^{N} n\,h(n)\,p(N-n|V)}{\sum_{n>0} n\,h(n)}; \qquad (A4)$$

the terms involving $h(n)$ denote the probability that $V$ is
centred on a particle in an $n$-halo, and the term involving
$p$ denotes the probability that $V$ contains $N - n$ particles
in addition to the $n$ which are associated with the halo in
which the chosen particle sits. Note that the term in the
denominator above is $dH(s)/ds$ evaluated at $s = 1$. In what
follows, we will denote this quantity $H'(s = 1)$.

To derive an expression for $q$, it is convenient to begin
with

$$\begin{aligned}
Q(s|V) &= \sum_{N>0} s^N\,q(N|V) = \sum_{N>0}\sum_{n=1}^{N} s^N\,\frac{n\,h(n)\,p(N-n|V)}{H'(s=1)} \\
&= \sum_{N>0}\sum_{n=1}^{N} \frac{s^n\,n\,h(n)}{H'(s=1)}\,s^{N-n}\,p(N-n|V) \\
&= \sum_{n>0} \frac{s^n\,n\,h(n)}{H'(s=1)} \sum_{N>n-1} s^{N-n}\,p(N-n|V) \\
&= \frac{s}{\bar N}\,\frac{\partial \ln P(s|V)}{\partial s}\,P(s|V) = \frac{s}{\bar N}\,\frac{\partial P(s|V)}{\partial s} \\
&= \frac{1}{\bar N} \sum_{N>0} N\,s^N\,p(N|V). \qquad (A5)
\end{aligned}$$

Comparison of the first and last expressions shows that

$$q(N|V) = \frac{N}{\bar N}\,p(N|V). \qquad (A6)$$

Evidently, if one only considers volumes which are centred
on particles, then the distribution of counts in such cells
is simply related to the distribution of counts in randomly
placed volumes: the two distributions differ by one factor
of $N/\bar N$.

This useful result follows from two assumptions: clusters
have vanishingly small sizes, and they are uncorrelated
with one another. So long as we restrict attention to volumes
$V$ which are large compared to the virial radius of a typical
cluster (about 2 Mpc), the assumption that halos have
negligible sizes should be reasonable. The neglect of cluster
clustering in our Universe is less reasonable, but, as we show
below, is still a useful approximation.

Equation (A5) implies that $\bar N_q$, the mean of $q$, is

$$\bar N_q = \frac{\langle N\rangle}{\bar N} + \frac{\langle N(N-1)\rangle}{\bar N} = 1 + \bar N\,(1 + \bar\xi), \qquad (A7)$$

where $\bar\xi$ is the volume average of the two point correlation
function. This makes intuitive sense: the mean count when
centred on a particle is one plus $\bar N$, plus a contribution from
the fact that the particles are correlated.



A2 The Generalized Poisson and Negative 
Binomial distributions 

In what follows, we use the cluster mass function associated
with an initially Poisson distribution:

$$h(n) = \frac{(nb)^{n-1}\,\exp(-nb)}{n!}, \qquad {\rm where}\ b = (1 + \delta_c)^{-1} \qquad (A8)$$

(Epstein 1983; Sheth 1995). Here $\delta_c$ is the critical density
in the initial density fluctuation field that is required for
collapse in the spherical model. Thus, $b = 0$ initially, and it
grows to $b \to 1$. If these clusters have a Poisson distribution,
then

$$p(N|V) = \frac{\bar N(1-b)}{N!}\,\left[\bar N(1-b) + Nb\right]^{N-1}\,\exp\left[-\bar N(1-b) - Nb\right] \qquad (A9)$$

is the Generalized Poisson distribution (e.g. Sheth 1998b).
Here $\bar N$ is the average number of particles in a cell of size
$V$: $\langle N\rangle = \bar N$. This distribution, which is sometimes called
the Thermodynamic distribution in the astrophysical literature
(Saslaw & Hamilton 1984; Sheth 1995), provides a
reasonably good description of the counts of galaxies in randomly
placed cells of size $V$ (Hamilton, Saslaw & Thuan
1985; Sheth, Mo & Saslaw 1994; Conroy et al. 2006).

For this distribution, the variance is $\langle N^2\rangle - \langle N\rangle^2 =
\bar N/(1-b)^2$ so the mean value of $q$ is $\bar N + 1/(1-b)^2$. The solid
line in Figure 1 shows $q(N|V)$ associated with equation (A9),
where $\bar N = 0.01 \times 4\pi 8^3/3 = 21.45$ and $b = 0.8$. It provides a
good description of the measured distribution.
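The moments quoted for the Generalized Poisson distribution can be checked numerically. The sketch below is an illustration, not the code used for Figure 1: it evaluates the distribution in log space to avoid overflow, truncates the sum at a large count (an assumption that is harmless here because the tail decays exponentially), and applies the $N/\bar N$ weighting to obtain the particle-centred counts.

```python
import math

def generalized_poisson_logpmf(n, n_bar, b):
    """log p(N|V) for the Generalized Poisson distribution."""
    theta = n_bar * (1.0 - b)
    return (math.log(theta) + (n - 1) * math.log(theta + n * b)
            - theta - n * b - math.lgamma(n + 1))

N_BAR = 0.01 * 4.0 * math.pi * 8.0 ** 3 / 3.0   # about 21.45
B = 0.8
counts = range(0, 3000)                          # truncation of the infinite sum

p = [math.exp(generalized_poisson_logpmf(n, N_BAR, B)) for n in counts]
mean_p = sum(n * pn for n, pn in zip(counts, p))
var_p = sum(n * n * pn for n, pn in zip(counts, p)) - mean_p ** 2

# Counts in cells centred on particles: q(N|V) = (N / N_bar) p(N|V).
q = [n * pn / N_BAR for n, pn in zip(counts, p)]
mean_q = sum(n * qn for n, qn in zip(counts, q))

print(mean_p, var_p, mean_q)
```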

The dashed line shows a similar analysis of the Negative 
Binomial distribution. This distribution has 



\begin{equation}
p(N|V) = \frac{(\gamma + N - 1)!}{N!\,(\gamma - 1)!}\;
         \beta^{N}\,(1-\beta)^{\gamma}
\tag{A10}
\end{equation}



with $\gamma = \bar N(1-\beta)/\beta$. The Negative Binomial is a
Compound Poisson distribution with

\begin{equation}
h(n) = \frac{\beta^{n}/n}{-\ln(1-\beta)}.
\tag{A11}
\end{equation}



The mean and variance of this distribution are $\bar N$ and
$\bar N/(1-\beta)$, so we have set $\bar N = 21.45$ as before, and
$\beta = 1 - (1-b)^2 = 0.96$ (so the variance also matches that of the
Generalized Poisson distribution and the data).

Figure 1 suggests that the Generalized Poisson distri- 
bution provides a slightly better description of this dataset 
than does the Negative Binomial. We are not as interested 
in which provides a better fit, as we are in the fact that a 
Compound Poisson model appears to work so well. 
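
That both distributions are Compound Poisson can be illustrated with a short numerical check. The sketch below is an illustration under stated assumptions rather than the procedure used for Figure 1: it builds the counts distribution from a Poisson number of clusters using the standard Panjer recursion, and compares the result with the closed forms (A9) and (A10). The mean number of clusters per cell is taken to be $\bar N(1-b)$ for the Generalized Poisson and $-\gamma\ln(1-\beta)$ for the Negative Binomial; the cutoff `N_MAX` and all names are arbitrary choices.

```python
import math

def compound_poisson(lam, h, n_max):
    """Counts distribution p[0..n_max] for a Poisson(lam) number of clusters
    whose sizes n >= 1 are drawn independently from h[n] (Panjer recursion)."""
    p = [math.exp(-lam)]
    for N in range(1, n_max + 1):
        p.append(lam / N * sum(n * h[n] * p[N - n] for n in range(1, N + 1)))
    return p

Nbar, b, N_MAX = 21.45, 0.8, 150
beta = 1.0 - (1.0 - b)**2
gamma = Nbar * (1.0 - beta) / beta

# Cluster size distributions of equations (A8) and (A11); index 0 is unused.
borel = [0.0] + [math.exp((n - 1) * math.log(n * b) - n * b - math.lgamma(n + 1))
                 for n in range(1, N_MAX + 1)]
log_norm = -math.log1p(-beta)  # -ln(1 - beta)
log_series = [0.0] + [beta**n / (n * log_norm) for n in range(1, N_MAX + 1)]

# Closed forms of equations (A9) and (A10), evaluated in log space.
lam_gp = Nbar * (1.0 - b)
gen_poisson = [math.exp(math.log(lam_gp) - math.lgamma(N + 1)
                        + (N - 1) * math.log(lam_gp + N * b) - lam_gp - N * b)
               for N in range(N_MAX + 1)]
neg_binom = [math.exp(math.lgamma(gamma + N) - math.lgamma(N + 1)
                      - math.lgamma(gamma)
                      + N * math.log(beta) + gamma * math.log(1.0 - beta))
             for N in range(N_MAX + 1)]

err_gp = max(abs(x - y) for x, y in
             zip(compound_poisson(lam_gp, borel, N_MAX), gen_poisson))
err_nb = max(abs(x - y) for x, y in
             zip(compound_poisson(gamma * log_norm, log_series, N_MAX), neg_binom))
print("max |difference|, Generalized Poisson:", err_gp)
print("max |difference|, Negative Binomial:", err_nb)
```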



A3 Halo bias in Compound Poisson distributions 

Having made the connection between counts-in-cells and 
the cluster distribution, we now consider environmental ef- 
fects in Poisson cluster models. The mean density of n-halos 
which are surrounded by regions which contain N particles 
on the scale V is 



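\begin{equation}
h(n|N) = \frac{\bar n\,h(n)}{\sum_n n\,h(n)}\;
         \frac{p(N-n|V)}{p(N|V)},
\tag{A12}
\end{equation}

where $\bar n \equiv \bar N/V$ is the mean number density of particles.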



The ratio of this to the average density of $n$-halos is
$p(N-n|V)/p(N|V)$. Since this ratio generally differs
from unity, the mass function of halos in dense regions is
different from that in underdense regions. Typically, $p(N|V)$ drops
exponentially when $N \gg \bar N$: $p(N|V) \propto \exp(-aN)$ with
$a > 0$. Thus, $p(N-n)/p(N) \approx \exp(an)$: massive halos are
exponentially more abundant in regions with large $N$.
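
The exponential enhancement can be made concrete for the Generalized Poisson distribution of equation (A9). The sketch below is illustrative only: it evaluates the ratio $p(N-n|V)/p(N|V)$ for a small and a large halo in cells of increasing $N$, using $\bar N = 21.45$ and $b = 0.8$ from Section A2. The decay rate $a = b - 1 - \ln b$ is an estimate obtained here by applying Stirling's approximation to (A9) at $N \gg \bar N$; it is not quoted in the text above.

```python
import math

def log_gp(N, Nbar=21.45, b=0.8):
    """Natural log of the Generalized Poisson distribution, equation (A9)."""
    lam = Nbar * (1.0 - b)
    return (math.log(lam) - math.lgamma(N + 1)
            + (N - 1) * math.log(lam + N * b) - lam - N * b)

def abundance_ratio(n, N, Nbar=21.45, b=0.8):
    """p(N - n|V) / p(N|V): abundance of n-halos in cells containing N
    particles, relative to their average abundance."""
    return math.exp(log_gp(N - n, Nbar, b) - log_gp(N, Nbar, b))

b = 0.8
a_gp = b - 1.0 - math.log(b)  # assumed large-N decay rate of (A9), from Stirling

for N in (100, 500, 2000, 5000):
    print(N, abundance_ratio(5, N), abundance_ratio(50, N))
print("decay rate a:", a_gp,
      "measured:", math.log(abundance_ratio(50, 5000)) / 50)
```

In dense cells the ratio for the 50-particle halo exceeds that for the 5-particle halo, in line with the argument that massive halos are preferentially found where $N$ is large.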

The following explicit calculation shows that the Pois- 
son cluster model actually captures much of the usual 
parametrization of the environmental dependence of halo 
abundances. That is, the following demonstrates that mass 
conservation itself provides a significant source of halo bias. 
Since mass conservation is most important on scales V where 
the typical mass in a cell is not substantially larger than the 
typical halo mass, we expect the Poisson cluster model to 
provide a reasonable approximation on such scales. (And re- 
call that Figure 1 shows that the Generalized Poisson and 
Negative Binomial distributions do indeed provide good de- 
scriptions of the data.) 



A4 Halo bias in the Generalized Poisson 
distribution 

If we define $NB \equiv \bar N(1-b) + Nb$ (the reason for this
notation will become clear shortly), then the density of $n$-halos
surrounded by regions of size $V$ which contain $N$ particles
is



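\begin{equation}
h(n|N) = \frac{B-b}{BV}\,\binom{N}{n}
         \left(\frac{nb}{NB}\right)^{n-1}
         \left(1 - \frac{nb}{NB}\right)^{N-n-1},
\tag{A13}
\end{equation}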



where we have used the fact that
$1 - b/B = (NB - Nb)/NB = \bar N(1-b)/[\bar N(1-b) + Nb]$. This form is
precisely that of the conditional mass function of $n$-halos in the
Poisson model (Sheth 1995, 2003). This is remarkable for
the following reason.
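
The algebra leading to equation (A13) can be verified numerically. The sketch below is a consistency check under stated assumptions: the average density of $n$-halos is taken to be $(\bar N/V)(1-b)\,h(n)$ with $h(n)$ from equation (A8), for which $\sum_n n\,h(n) = 1/(1-b)$, and $V = 1$ is an arbitrary choice of units. It compares this density times $p(N-n|V)/p(N|V)$ with the closed form, and also checks that the halos in a cell account for all $N$ particles.

```python
import math

Nbar, b, V = 21.45, 0.8, 1.0  # V = 1 is an arbitrary choice of units

def log_gp(N):
    """Natural log of the Generalized Poisson distribution, equation (A9)."""
    lam = Nbar * (1.0 - b)
    return (math.log(lam) - math.lgamma(N + 1)
            + (N - 1) * math.log(lam + N * b) - lam - N * b)

def borel(n):
    """Cluster size distribution of equation (A8)."""
    return math.exp((n - 1) * math.log(n * b) - n * b - math.lgamma(n + 1))

def h_direct(n, N):
    """Average n-halo density times p(N-n|V)/p(N|V), as in equation (A12)."""
    mean_density = (Nbar / V) * (1.0 - b) * borel(n)
    return mean_density * math.exp(log_gp(N - n) - log_gp(N))

def h_closed(n, N):
    """Closed form of equation (A13)."""
    B = Nbar * (1.0 - b) / N + b
    x = n * b / (N * B)
    return ((B - b) / (B * V) * math.comb(N, n)
            * x**(n - 1) * (1.0 - x)**(N - n - 1))

max_rel_diff = max(abs(h_direct(n, N) / h_closed(n, N) - 1.0)
                   for N in (5, 20, 60) for n in range(1, N + 1))
mass_in_cell = sum(n * V * h_closed(n, 60) for n in range(1, 61))
print("max relative difference:", max_rel_diff)
print("particles accounted for in an N = 60 cell:", mass_in_cell)
```

The second check is mass conservation in its simplest form: summing $n\,V\,h(n|N)$ over all $n$ recovers $N$.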

The usual estimate of the environmental dependence of
halo abundances uses the conditional mass function, and it
uses the spherical evolution model to transform the density
$N/\bar N \equiv 1+\delta$ to an initial overdensity (Mo & White 1996; Sheth
& Tormen 2002). To see what the transformation is in the
present case, set $B = (1+\delta_0)^{-1}$ (this is motivated by the
fact that $b = (1+\delta_c)^{-1}$). Then $B = \bar N(1-b)/N + b$, so
$1+\delta_0 = (1+\delta_c)(1+\delta)/[\delta_c + 1 + \delta]$. Hence, in a model
in which the halos in the unconditional mass function are
assumed to have a Poisson spatial distribution, the mapping
between linear and nonlinear density is

\begin{equation}
\delta_0 = \frac{\delta_c}{1+\delta_c/(1+\delta)}\;\frac{\delta}{1+\delta}
         \approx \delta_c\,\frac{\delta}{1+\delta}.
\tag{A14}
\end{equation}

This relation is qualitatively like that of the spherical model,
which is very well approximated by

\begin{equation}
1+\delta \approx \left(1 - \frac{\delta_0}{\delta_c}\right)^{-\delta_c}.
\tag{A15}
\end{equation}

See Sheth (1998b,c) for more discussion of the similarities
between this and the spherical model, and another deriva- 
tion of the relation between equations (A8) and (A9). This 
qualitative similarity, and the fact that the Generalized Pois- 
son model appears to describe the data in Figure 1 reason- 
ably well, both suggest that Poisson cluster models may pro- 
vide useful insight into the origin of environmental effects. 
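
The two mappings between nonlinear and linear overdensity can be compared directly. The sketch below is illustrative: it evaluates the exact form of equation (A14), $\delta_0 = \delta_c\,\delta/(1+\delta+\delta_c)$, alongside equation (A15) solved for $\delta_0$. The value $\delta_c = 1.686$ is an assumption (the usual spherical collapse threshold); the text above does not fix it.

```python
DELTA_C = 1.686  # assumed value of the critical overdensity

def delta0_poisson(delta, delta_c=DELTA_C):
    """Exact form of equation (A14) for the Poisson cluster model."""
    return delta_c * delta / (1.0 + delta + delta_c)

def delta0_spherical(delta, delta_c=DELTA_C):
    """Equation (A15) solved for the linear overdensity delta_0."""
    return delta_c * (1.0 - (1.0 + delta) ** (-1.0 / delta_c))

deltas = (-0.5, 0.0, 1.0, 10.0, 100.0)
poisson_values = [delta0_poisson(d) for d in deltas]
spherical_values = [delta0_spherical(d) for d in deltas]

for d, dp, ds in zip(deltas, poisson_values, spherical_values):
    print(d, dp, ds)
```

Both mappings vanish at $\delta = 0$, increase monotonically with $\delta$, and saturate at $\delta_c$ for very dense regions, which is the sense in which they are qualitatively alike. They differ in detail: for small $\delta$ the Poisson mapping has slope $\delta_c/(1+\delta_c)$ rather than unity.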

In these models, environmental effects arise not because 
dense regions host the most massive halos, but because a 
region which contains a massive halo tends to be denser 
than average, simply because of mass conservation. This is 
a rather different view of environmental effects than that of 
Mo & White (1996) and Sheth & Tormen (2002), where the 
large scale environment, rather than mass conservation, is 
seen as the primary driver of the correlation between halo 
mass and environment! 

The formation histories of halos in Poisson cluster mod- 
els have been studied in Sheth (1998a). By combining that 
analysis with the present one, it should be interesting and 
straightforward to see what correlations between formation 
history and environment are built into such models. In ad- 
dition, it would also be interesting to repeat this analysis for 
the Negative Binomial distribution. 



